TASK
1. Standard Error of Estimate:
The standard error of estimate is the estimated standard deviation of the error term u. It is also known as the standard error of the regression. It measures how far the observed values typically fall from the fitted regression line, and so indicates how accurate the model's estimates are.
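The formula of the standard error of estimate: sqrt(SSE/(n-k-1)), where SSE is the sum of squared errors, n is the number of observations and k is the number of independent variables. The divisor n-k-1 is the residual degrees of freedom shown in the ANOVA table.
Following is the calculation of the Standard Error of Estimate:
SSE | 25843.41
k | 2
n | 400
n-k-1 | 397
SSE/(n-k-1) | 65.0968
sqrt(SSE/(n-k-1)) | 8.0683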
If the standard error of estimate is small, the observations lie close to the regression line and the estimates are more accurate. Where the standard error is large, the observations are widely scattered around the line and the data may have some notable irregularities.
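The calculation can be sketched in a few lines of Python. This is a minimal illustration, assuming that k counts only the independent variables, so that the residual degrees of freedom are n - k - 1 = 397 as listed in the ANOVA output. The function name is hypothetical.

```python
import math


def standard_error_of_estimate(sse: float, n: int, k: int) -> float:
    """Square root of SSE divided by the residual degrees of freedom.

    Assumption: k counts the independent variables only, so the intercept
    uses one further degree of freedom (n - k - 1).
    """
    return math.sqrt(sse / (n - k - 1))


# Values taken from the assignment: SSE = 25843.41, n = 400, k = 2.
s = standard_error_of_estimate(sse=25843.41, n=400, k=2)
print(round(s, 4))
```

The result is in the units of the dependent variable, which makes it easy to compare with the typical size of the observations.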
2. Coefficient of Determination:
The coefficient of determination is a statistic that measures how well the model explains the observed outcomes. It is also called R-squared and is used to assess the fit of the model. It gives the proportion of the observed variation in the dependent variable that is explained by the independent variables in the model. It is represented as a value between 0 and 1. The closer the value is to 1, the better the fit. Thus, if R-squared is equal to 0.2672, then about 26.72% of the observed variation, well under half, can be explained by the model.
Formula of Coefficient of determination: MSS/TSS = (TSS − RSS)/TSS
Where MSS is the model sum of squares (SSR in the ANOVA table), RSS is the residual sum of squares (SSE) and TSS is the total sum of squares (SST) associated with the outcome variable.
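As a quick check, the formula can be applied to the sums of squares reported in the ANOVA output (RSS = 25843.41, TSS = 35264.98). The sketch below is illustrative only, and the function name is an assumption.

```python
def r_squared(rss: float, tss: float) -> float:
    """Share of the total variation explained by the model: (TSS - RSS) / TSS."""
    return (tss - rss) / tss


# Residual and total sums of squares from the ANOVA output.
r2 = r_squared(rss=25843.41, tss=35264.98)
print(round(r2, 4))
```

Rounded to four decimal places, this agrees with the R-squared value of 0.2672 used in the text.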
3. The Adjusted Coefficient of Determination for Degrees of Freedom:
The adjusted coefficient of determination is better suited to a model with several independent variables, such as a multiple regression model. It adjusts R-squared for the number of independent variables, so it rises only when an added variable improves the model more than would be expected by chance.
Following is the calculation of the adjusted coefficient of determination:
Adjusted coefficient of determination = 1 - (1 - R-squared)(n - 1)/(n - k - 1)
= 1 - (1 - 0.2672)[(400 - 1)/(400 - (2 + 1))]
= 0.2635083123
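The same arithmetic can be reproduced in Python. This is a minimal sketch, assuming the usual adjustment 1 - (1 - R-squared)(n - 1)/(n - k - 1) with k independent variables. The function name is hypothetical.

```python
def adjusted_r_squared(r2: float, n: int, k: int) -> float:
    """Adjust R-squared for n observations and k independent variables."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# R-squared = 0.2672, n = 400 observations, k = 2 independent variables.
adj = adjusted_r_squared(r2=0.2672, n=400, k=2)
print(round(adj, 4))
```

Because n is large relative to k here, the adjustment is small: the adjusted value is only slightly below 0.2672.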
4. Overall Utility Level of Model:
ANOVA table basis of calculation:
Source | df | SS | MS | F
Regression | k | SSR | MSR = SSR/k | F = MSR/MSE
Residual | n-k-1 | SSE | MSE = SSE/(n-k-1) |
Total | n-1 | SST | |
ANOVA
Source | df | SS | MS | F | Significance F
Regression | 2 | 9421.58 | 4710.79 | 72.37 | 0
Residual | 397 | 25843.41 | 65.1 | |
Total | 399 | 35264.98 | | |
Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95%
Intercept | 93.8993 | 8.0072 | 11.7269 | 0 | 78.1575 | 109.641
X1 | 0.4849 | 0.0412 | 11.7772 | 0 | 0.404 | 0.5659
X2 | -0.0229 | 0.0395 | -0.5811 | 0.56 | -0.1005 | 0.0546
From the above Excel outputs, the value of the test statistic for testing the overall utility of the model is F = 72.37. The output also reports the p-value of the test (Significance F), which is shown as 0 because it is smaller than the displayed precision.
Since p-value ≈ 0.00 < 0.05 = alpha, the model is useful at the 5% level of significance.
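The F statistic and its p-value can be rebuilt from the sums of squares as a check on the Excel output. The sketch below uses only the Python standard library. It relies on the fact that, when the numerator degrees of freedom equal exactly 2, the upper-tail probability of the F distribution has the closed form (1 + 2F/d)^(-d/2), where d is the residual degrees of freedom. For any other k a statistics library would be needed, so the function rejects other values. The function name is hypothetical.

```python
def overall_f_test(ssr: float, sse: float, n: int, k: int):
    """Return (F, p_value) for H0: all slope coefficients equal zero.

    The p-value uses a closed form that is valid only when the numerator
    degrees of freedom equal 2, which is the case in this model.
    """
    if k != 2:
        raise ValueError("closed-form p-value is valid only for k = 2")
    df_resid = n - k - 1
    msr = ssr / k          # mean square for regression
    mse = sse / df_resid   # mean square for error
    f_stat = msr / mse
    p_value = (1 + 2 * f_stat / df_resid) ** (-df_resid / 2)
    return f_stat, p_value


# SSR and SSE from the ANOVA output, with n = 400 and k = 2.
f_stat, p_value = overall_f_test(ssr=9421.58, sse=25843.41, n=400, k=2)
print(round(f_stat, 2), p_value < 0.05)
```

The computed p-value is far smaller than any precision Excel displays, which is consistent with the output showing 0.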
5. Interpretation of the Coefficients:
A regression coefficient represents the change in the mean of the response variable for a one-unit change in the predictor variable while holding the other predictors in the model constant. This is important because it isolates the role of one variable from all of the others in the model.
Coefficients help to determine whether there is a positive or negative relationship between each independent variable and the dependent variable. A positive coefficient indicates that as the value of the independent variable increases, the mean of the dependent variable also increases. A negative coefficient indicates that as the independent variable increases, the mean of the dependent variable decreases. The size of the coefficient shows how much the mean of the dependent variable changes for a one-unit shift in the independent variable while holding the other variables in the model constant.
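The "holding other variables constant" reading can be illustrated with the fitted equation from the coefficient table. In this sketch the input values 170 and 160 are hypothetical and chosen only for illustration. The coefficients are the reported estimates for the intercept, X1 and X2.

```python
# Estimated coefficients from the regression output.
B0, B1, B2 = 93.8993, 0.4849, -0.0229


def predict(x1: float, x2: float) -> float:
    """Predicted mean response for given values of X1 and X2."""
    return B0 + B1 * x1 + B2 * x2


# Hypothetical starting point, used only to illustrate the interpretation.
base = predict(x1=170.0, x2=160.0)

# Raise one predictor by one unit while the other stays fixed.
shift_x1 = predict(x1=171.0, x2=160.0) - base
shift_x2 = predict(x1=170.0, x2=161.0) - base

print(round(shift_x1, 4), round(shift_x2, 4))
```

Each shift equals the corresponding coefficient, whatever starting point is chosen, because the model is linear in X1 and X2.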
6. Relationship between Heights of Sons and Fathers:
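The regression line showing the relationship between the heights of sons and fathers is:
y = b0 + b1x
y = 93.8993 + 0.4849x
If there is no linear relationship between these variables, then b1 = 0. If there is a linear relationship, then b1 ≠ 0. For X1 the regression output gives t = 11.7772 with a p-value of about 0, and the 95% interval (0.404, 0.5659) does not contain 0, so the hypothesis b1 = 0 is rejected at the 5% level of significance. Hence, these data allow the statistician to infer that the heights of sons and fathers are linearly related.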
7. Relationship Between Heights of Sons and Mothers:
The regression line showing the relationship between the heights of sons and mothers is:
y = b0 + b2x
y = 93.8993 + (-0.0229)x
If there is no linear relationship between these variables, then b2 = 0. If there is a linear relationship, then b2 ≠ 0. The estimated coefficient is negative, but for X2 the regression output gives t = -0.5811 with a p-value of 0.56 > 0.05, and the 95% interval (-0.1005, 0.0546) contains 0, so the hypothesis b2 = 0 cannot be rejected at the 5% level of significance. Hence, these data do not allow the statistician to infer that the heights of sons and mothers are linearly related.
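The test of each slope against zero can be sketched directly from the reported p-values and 95% confidence limits. The code below is a minimal illustration at an assumed 5% significance level. The dictionary layout and function names are hypothetical, and the p-value of 0.0 for X1 is the rounded figure from the output, not an exact value.

```python
ALPHA = 0.05  # assumed significance level

# Rows copied from the regression output (p-values as displayed).
coefficients = {
    "X1": {"estimate": 0.4849, "p_value": 0.0, "lower": 0.404, "upper": 0.5659},
    "X2": {"estimate": -0.0229, "p_value": 0.56, "lower": -0.1005, "upper": 0.0546},
}


def is_significant(row: dict, alpha: float = ALPHA) -> bool:
    """Reject H0: slope = 0 when the reported p-value is below alpha."""
    return row["p_value"] < alpha


def interval_excludes_zero(row: dict) -> bool:
    """True when the 95% confidence interval lies entirely on one side of 0."""
    return row["lower"] > 0 or row["upper"] < 0


results = {name: is_significant(row) for name, row in coefficients.items()}
for name, row in coefficients.items():
    print(name, results[name], interval_excludes_zero(row))
```

The two checks agree for both slopes: a 95% interval that excludes 0 corresponds to a p-value below 0.05.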